1. INTRODUCTION

Broadly speaking, software systems requirements engineering (RE) is the process of discovering the
purpose of a software system by identifying stakeholders and their needs and documenting these in a form
that is amenable to analysis, communication, and subsequent implementation [1]. RE is important for
developing effective software and for reducing mistakes in the early stages of software development [2].
Requirements modeling uses a combination of text and diagrammatic forms to depict requirements in a way
that is relatively easy to understand and, more importantly, straightforward to review for correctness,
completeness, and consistency [3]. In analyzing software requirements, after the domain is understood and
the requirements are elicited, they are evaluated and negotiated, and the consolidated requirements are then
specified and documented [4]. This requirements specification and documentation is where requirements
modeling commonly occurs. Throughout requirements modeling, the primary focus is on what, not how.
In iStar 2.0’s strategic dependency model, the focus is on describing the dependency relationships between
the actors in the system, along with the intentional elements. In the requirements engineering community,
iStar 2.0 is gaining traction in both academia and industry and is used by many players in
the community [5]. The framework has been applied in various sectors, such as healthcare,
security analysis, and eCommerce [6].

When modeling requirements and designing software products, many engineers still draw
diagrams by hand instead of using software tools. One reason could be that
hand-drawing leads to more focused work and less distraction [7]. However,
in a sustained project with continuous revisions caused by requirements evolution, it gradually becomes
apparent that digitalizing the hand-drawn diagrams is essential to ever-evolving requirements
engineering activities. One of the first steps in diagram digitalization is object detection and recognition.
Object detection and recognition aim to detect and recognize every object belonging to a known class in an
image [8]. Several studies have shown the ability of advanced neural networks in image and object
recognition [9, 10, 11]; hence, this research uses a neural network architecture to implement
machine learning techniques that detect and recognize objects in a requirements diagram. In the machine
learning field, the Region-based Convolutional Neural Network (R-CNN) architecture is a popular method
with promising performance. Its rapid development has led to the currently known Faster R-CNN
(following its predecessors, R-CNN and Fast R-CNN), which offers better accuracy and processing speed [12].
Other research also shows the potential of Faster R-CNN to detect an object in an image with high accuracy
given a suitable dataset [13].

Furthermore, image pre-processing also plays a vital role in preparing datasets for object
detection [14]. One standard process is the color-to-grayscale technique. Grayscale images are images that
have only a single value for every pixel, resulting in a gray image that tends toward black at pixels with
weak intensity and toward white at pixels with high intensity [15]. This research uses Gleam as the grayscaling
method, as it has been argued that Gleam performs better than other techniques [16].
In addition, to upsample the dataset towards a high-performing model, Salt and Pepper noise
is used for its ability to replicate image data with variations by imitating the pixel errors caused by faulty
bit transmission and analog-to-digital conversion [17].

This paper reports the results of an early study that aims to implement and evaluate
the performance of Faster R-CNN, Gleam, and the Salt and Pepper technique for single object detection
and recognition in a hand-drawn iStar 2.0 strategic dependency model for requirements modeling. The
model’s performance is measured by calculating the precision, accuracy, recall, and F-measure when
classifying the symbols of the iStar 2.0 notation.


2. RESEARCH METHOD 

To implement and evaluate the performance of Faster R-CNN, Gleam,
and the Salt and Pepper technique for single object detection and recognition in a hand-drawn iStar 2.0
strategic dependency model for requirements modeling, the research was carried out in the following stages.

- Literature review and requirements analysis,
- Experiment and system design,
- System construction and coding,
- Testing and evaluation, and
- Research documentation.

First, literature review and requirements analysis are conducted to define the problem
and then propose a solution, in this case by deciding on the most suitable methods and practices. Second, after
the literature is reviewed and the problems are defined, the architecture and system design is done using
flowcharts, which lay out the steps of the object detection and recognition program, and UI
mockups for testing purposes. Then the designed system is constructed, and testing is conducted to evaluate
the performance of the machine learning model. Lastly, all the activities conducted in the research
are documented.


2.1. iStar 2.0 

The i* language was presented in the mid-nineties [18] as a goal- and actor-oriented modeling and
reasoning framework. It consists of a modeling language along with reasoning techniques for analyzing
the created models. i* was quickly adopted by the research community in fields such as requirements
engineering and business modeling. Benefiting from its intentionally open nature, multiple extensions of
the i* language have been proposed (see [19, 20] for useful reviews), either by slightly redefining some
existing constructs, by detailing some semantic issues not completely defined in the seminal proposal,
or by proposing new constructs for specific domains. In response to the need to balance the framework’s
open nature, and as a possible solution to the adoption problems arising from these many variants, the i*
research community started an initiative to identify a widely agreed-upon set of core concepts in the i*
language. The main goal is to keep open the ability to tailor the framework while agreeing on the fundamental
constructs. This led to the work to propose an update to the framework; to clearly distinguish this core
language from its predecessors, it is named iStar 2.0.


2.1.1. iStar 2.0 elements 
Actors are central to the social modeling nature of the language [6]. Actors are active, autonomous 
entities that aim at achieving their goals by exercising their know-how, in collaboration with other actors. 
Whenever distinguishing the type of actor is not relevant, either because of the scenario at hand
or the modeling stage, the notion of a generic actor (without specialization) can be used in the model.
Actors are represented graphically as circles.
Intentional elements are the things actors want. As such, they model different kinds of requirements 
and are central to the iStar 2.0 language. An intentional element appearing inside the boundary of an actor 
denotes something that is desired or wanted by that actor. The following elements are included in 
the language [6], with examples shown in Figure 1: 

- Goal: a state of affairs that the actor wants to achieve and that has clear-cut criteria of achievement.

- Quality: an attribute for which an actor desires some level of achievement. Qualities can guide the search 
for ways of achieving goals, and also serve as criteria for evaluating alternative ways of achieving goals. 

- Task: represents actions that an actor wants to be executed, usually with the purpose of achieving 
some goal. 

- Resource: a physical or informational entity that the actor requires in order to perform a task.


[Figure 1 example labels: "Tickets booked" (goal), "Quick booking" (quality), "Pay for tickets" (task), "Credit card" (resource)]


Figure 1. iStar 2.0 intentional elements [6] 





2.2. Faster R-CNN, Gleam, and Salt and Pepper Noise 
Faster Region-based Convolutional Neural Network (Faster R-CNN) is an upgraded version of R-CNN with better
performance for object detection. Figure 2 shows the architecture of Faster R-CNN, whose steps are
as follows [21].

- Region Proposal Network: The first task is to search the given input image for the areas where an object
is likely to be located, so that the position of the object in the image can be found [22].
Each area that may contain an object is bounded by a region known as a region of
interest (RoI).

- Classification: This stage classifies the regions of interest identified in the previous step into their
corresponding classes. The technique deployed here is a Convolutional Neural Network (CNN).

The proposed approach rigorously searches for all areas of the image where an object may be located.
However, if no regions are identified in the first stage of the algorithm, there is no need to proceed
to the second step of the approach [23].

[Figure 2 labels: feature maps, Region Proposal Network, proposals, classifier]

Figure 2. Faster R-CNN Architecture [21] 
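
The two-stage flow described above, including the early exit when no region is proposed, can be sketched as follows. This is only a control-flow illustration: `propose_regions` and `classify_region` are hypothetical callables standing in for the Region Proposal Network and the CNN classifier, and are not functions of any library.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]      # (x_min, y_min, x_max, y_max)
Detection = Tuple[Box, str, float]   # (region of interest, class label, score)


def detect(
    image: object,
    propose_regions: Callable[[object], List[Box]],
    classify_region: Callable[[object, Box], Tuple[str, float]],
) -> List[Detection]:
    """Run the two stages in order, skipping stage 2 when stage 1 finds nothing."""
    regions = propose_regions(image)          # stage 1: regions of interest
    if not regions:
        return []                             # no RoI, so no classification is needed
    detections = []
    for box in regions:
        label, score = classify_region(image, box)   # stage 2: classify each RoI
        detections.append((box, label, score))
    return detections
```

In an actual Faster R-CNN both stages share the same convolutional feature maps; the sketch leaves that out and keeps only the ordering of the steps.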

Color-to-grayscale is the transformation of an image’s RGB channels into a grayscale image. Grayscale is
the condition in which an image has only a single value for each of its pixels. A grayscale image generally
consists of gray, black (in pixels with weak intensity), and white (in pixels with strong intensity) [15].
Formula (1) converts the RGB channels of a pixel into a single value ranging from 0 to 255
(grayscale) [16], where R’, G’, and B’ are obtained from the RGB channels by gamma correction using
Formula (2). Figure 3 shows the result of a grayscaling process using Gleam.


$$\mathrm{Gleam} = \frac{1}{3}\left(R' + G' + B'\right) \qquad (1)$$

$$\Gamma(t) = t' = t^{1/2.2} \qquad (2)$$


Figure 3. Example of grayscaling using Gleam 
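
As an illustration of the Gleam conversion, the sketch below gamma-corrects each channel and averages the three corrected values. It is a minimal sketch, not the implementation used in this research: it assumes a gamma of 2.2, assumes that channel values are normalized to [0, 1] before correction and rescaled to 0-255 afterwards, and uses hypothetical function names. A real pipeline would normally vectorize this with an array library.

```python
def gamma_correct(t: float, gamma: float = 2.2) -> float:
    """Gamma-correct a channel value t in [0, 1] as t' = t ** (1 / gamma)."""
    return t ** (1.0 / gamma)


def gleam_pixel(r: int, g: int, b: int) -> int:
    """Mean of the gamma-corrected channels, rescaled to the 0-255 range."""
    corrected = [gamma_correct(channel / 255.0) for channel in (r, g, b)]
    return round(255.0 * sum(corrected) / 3.0)


def gleam_image(rgb_rows):
    """Apply gleam_pixel to an image stored as rows of (R, G, B) tuples."""
    return [[gleam_pixel(*pixel) for pixel in row] for row in rgb_rows]


if __name__ == "__main__":
    tiny_image = [[(0, 0, 0), (255, 255, 255)], [(200, 30, 30), (128, 128, 128)]]
    print(gleam_image(tiny_image))
```

Because the correction is applied before averaging, mid-range values come out brighter than they would under a plain mean of the raw channels.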


Salt and Pepper noise is used to replicate images in the dataset for training the model by applying
noise to the original image. It does so by changing pixel values to the minimum or maximum accepted
value [24, 25]. Figure 4 shows the result of applying the noise to an image.





Figure 4. Application of Salt and Pepper Noise on a Hand-Drawn Task Object in iStar 2.0
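
The replication step can be illustrated with the following sketch, which produces a noisy copy of a grayscale image by setting randomly chosen pixels to the minimum (0) or maximum (255) value. The function name, the `amount` parameter, and its default of 0.05 are assumptions made for the example; the noise density used in this research is not stated here.

```python
import random


def salt_and_pepper(gray_rows, amount=0.05, seed=None):
    """Return a noisy copy of a grayscale image given as rows of 0-255 ints.

    Each pixel is replaced, with probability `amount`, by either the
    minimum value 0 ("pepper") or the maximum value 255 ("salt").
    """
    rng = random.Random(seed)
    noisy = []
    for row in gray_rows:
        new_row = []
        for value in row:
            if rng.random() < amount:
                new_row.append(rng.choice((0, 255)))
            else:
                new_row.append(value)
        noisy.append(new_row)
    return noisy


if __name__ == "__main__":
    original = [[128] * 8 for _ in range(4)]
    # Different seeds give different noisy replicas of the same drawing.
    replicas = [salt_and_pepper(original, amount=0.1, seed=s) for s in range(3)]
    print(len(replicas), "replicas generated")
```

The original image is left untouched, so each call yields one additional training sample with the same label as its source.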


2.3. Requirements modeling tools 

Several studies have already emphasized the importance of the i* framework [18] for modeling
and documenting requirements, including the newly standardized iStar 2.0 [6, 19, 20]. Previous
research proposed integrating several requirements modeling frameworks and notations, including the
early i* framework, and showed the potential of using i* as a tool to model stakeholder
dependencies in analyzing early-phase requirements [26, 27]. Another study recognized the need for a tool
for drawing and editing iStar 2.0 diagrams and developed the piStar tool to support the creation of the
requirements model [28]. Other studies proposed extensions to iStar 2.0 [29] and a prototype for
generating meaningful layouts [30]. However, work on digitalization and on the use of machine learning
architectures for object detection on iStar diagrams is still rare. This research aims to address this
gap by reviewing its importance and kickstarting the development of such a tool.


2.4. Single object detection and recognition for iStar 2.0 

Using the architecture provided by the Faster R-CNN technique, grayscaling using Gleam,
and upsampling the dataset by replicating images using Salt and Pepper noise, the program is then designed.
Figure 5 shows the flow of the training activity that builds the machine learning model, which will
be used to detect and recognize objects. At the beginning, 600 images are collected as the dataset,
consisting of drawings of five objects in the iStar 2.0 notation: goal, quality, actor, task, and resource.
After the dataset is collected, each object is labeled, resulting in an XML file containing all
the images and their labels. The generated XML is then converted to a CSV file, which is used to train
the machine learning model, after which the inference graph is exported. These actions are done using
the TensorFlow Object Detection API. The resulting model is then tested and measured to determine
its performance, based on a confusion matrix used to calculate the accuracy, recall, precision, and F1-score
of the model. Several configurations are tested to find the best scenario. The results are described in the next
section of this paper.
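
The XML-to-CSV conversion step can be sketched as below. The layout of the label file is not given in this paper, so the sketch assumes a Pascal VOC-style annotation (a `filename`, a `size`, and one `object` with a `bndbox` per labeled drawing); the column names, function names, and sample file name are likewise illustrative assumptions.

```python
import csv
import io
import xml.etree.ElementTree as ET

CSV_COLUMNS = ["filename", "width", "height", "class", "xmin", "ymin", "xmax", "ymax"]


def annotation_to_rows(xml_text):
    """Turn one Pascal VOC-style annotation into CSV rows, one per labeled object."""
    root = ET.fromstring(xml_text)
    filename = root.findtext("filename")
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    rows = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        rows.append([
            filename, width, height, obj.findtext("name"),
            int(box.findtext("xmin")), int(box.findtext("ymin")),
            int(box.findtext("xmax")), int(box.findtext("ymax")),
        ])
    return rows


def rows_to_csv(rows):
    """Serialize the rows, preceded by a header line, into CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(CSV_COLUMNS)
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical annotation for one hand-drawn Task object.
SAMPLE_XML = """<annotation>
  <filename>task_001.jpg</filename>
  <size><width>300</width><height>200</height></size>
  <object>
    <name>task</name>
    <bndbox><xmin>40</xmin><ymin>30</ymin><xmax>260</xmax><ymax>170</ymax></bndbox>
  </object>
</annotation>"""

print(rows_to_csv(annotation_to_rows(SAMPLE_XML)))
```

A flat CSV of this kind is a convenient intermediate form, since each row carries everything needed to build one training example.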

Besides the training activity and its processes, the flow
of the program feature that detects and recognizes submitted, unlabeled image data is also designed,
as shown in Figure 6. The flow starts by getting the submitted image file of a hand-drawn iStar 2.0 object,
whose pixels are then converted to grayscale (using Gleam). Next, the paths are initialized so that
the object detection API knows the exact location of the files. NUM_CLASSES describes the number
of existing classes into which an object can be classified. An object is considered to belong to a class when
it achieves a score greater than 0.9; if more than one class achieves a score greater than 0.9, the first
identified class is taken as the correct class.
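
The decision rule above (a score greater than 0.9, with the first identified class winning when several qualify) can be sketched as follows. The function name and the numeric class IDs in `LABEL_MAP` are hypothetical; only the five class names and the threshold come from the description above.

```python
# Hypothetical ID-to-name mapping for the five iStar 2.0 classes (NUM_CLASSES = 5).
LABEL_MAP = {1: "goal", 2: "quality", 3: "actor", 4: "task", 5: "resource"}


def pick_class(scores, class_ids, label_map, threshold=0.9):
    """Return (label, score) for the first detection scoring above the threshold.

    Detections are scanned in the order given, so when several exceed the
    threshold the first one is returned. Returns None if none qualifies.
    """
    for score, class_id in zip(scores, class_ids):
        if score > threshold:
            return label_map[class_id], score
    return None


if __name__ == "__main__":
    print(pick_class([0.99, 0.93, 0.10], [4, 5, 1], LABEL_MAP))
```

Note that the comparison is strict, so a score of exactly 0.9 does not qualify under this reading of the rule.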


[Figure 5 flowchart boxes: Dataset; Color-to-grayscale (Gleam); Salt and Pepper Noise; Copy XML; Reformat Content XML; Convert XML to CSV; Generate TF Record; Training Model; Export Inference Graph]

Figure 5. Flowchart training model 


[Figure 6 flowchart boxes: Get file; Initialize image variable from file; Convert Image RGB to GBR; Color-to-grayscale (Gleam); Initialize Model Name; Initialize Path To Model; Initialize Path To Label; Initialize NUM_CLASSES = 5; Load the label map and initialize categories from label map; Define input and output tensors; Perform the actual detection by running the model with the image as input; Initialize scores and classes of result from image; Initialize number of object detected; Initialize label from object detected with threshold of score > 0.90; Return object detected of index 0]

Figure 6. Flowchart iStar 2.0 object detection and recognition 


3. RESULTS AND ANALYSIS

To measure the model’s performance and evaluate its potential to be further developed as
a support tool for requirements modeling, seven test scenarios were designed and experiments were conducted
to find the best conditions for building a high-performing model.


3.1. Test Scenarios 

The test scenarios vary the learning rate, feature extractor, initial crop size,
maxpool kernel, and maxpool stride. The details of the seven test scenarios can be seen in
Table 1. For each scenario, the model’s ability to detect and recognize objects is then measured
from the confusion matrix by calculating accuracy, precision, recall, and F1-score.
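
For reference, the four measures can be computed from confusion-matrix counts as in the sketch below. It treats one class at a time in a one-vs-rest fashion, which is an assumption about how the per-class figures are obtained; the function name and the example counts are illustrative and are not results from this research.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1-score from confusion-matrix counts.

    tp, fp, fn, tn are the true positive, false positive, false negative,
    and true negative counts for one class. Empty denominators yield 0.0.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Illustrative counts only.
    for name, value in classification_metrics(tp=45, fp=5, fn=3, tn=47).items():
        print(f"{name}: {value:.4f}")
```

Averaging these per-class values over the five classes gives one summary figure per measure for each scenario.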


Table 1. Test scenarios 


Scenario  Learning  Feature extractor type           First stage      Initial    Maxpool  Maxpool  Height  Width
          rate                                       features stride  crop size  kernel   stride   stride  stride
1         0.0003    faster_rcnn_inception_resnet_v2  8                17         1        1        8       8
2         0.0002    faster_rcnn_inception_v2         16               14         2        2        16      16
3         0.0002    faster_rcnn_inception_resnet_v2  16               14         2        2        16      16
4         0.0001    faster_rcnn_inception_v2         16               14         2        2        16      16
5         0.0003    faster_rcnn_inception_v2         8                17         1        1        8       8
6         0.0003    faster_rcnn_inception_resnet_v2  8                14         2        2        8       8
7         0.0002    faster_rcnn_inception_v2         16               17         1        1        16      16

Type and first stage features stride are feature extractor settings; height stride and width stride are anchor generator settings.


3.2. Result 

After the experiments were conducted with the configurations described in the previous
section, the test results are as shown in Table 2. The highest performing scenario is the fourth
scenario, which uses a learning rate of 0.0001 and feature extractor type faster_rcnn_inception_v2 with
a first stage features stride of 16, an initial crop size of 14, a maxpool kernel and stride of 2, and a height
and width stride of 16, resulting in an average of 94% accuracy, 95% precision, 100% recall, and 97.2%
F1-score for each class.


Table 2. Test results 


Scenario  Average   Average    Average  Average
          accuracy  precision  recall   F1-score
1         57%       63%        97%      75.05%
2         94%       94%        100%     96.87%
3         42%       52%        95%      65.51%
4         95%       95%        100%     97.20%
5         94%       95%        97%      95.91%
6         39%       48%        96%      60.92%
7         88%       91%        99%      94.36%


The results show the importance of the feature extractor: comparing Scenarios 1, 2, and 3,
the feature extractor of type faster_rcnn_inception_v2 performs much better than
faster_rcnn_inception_resnet_v2. The initial crop size also proves to be quite impactful, as seen by
comparing Scenarios 2 and 7. The learning rate is also a determining factor, even though it might not
produce a big gap, as between Scenarios 4 and 2, for example.
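
The comparison can be made explicit with a short script over the values in Table 1 and Table 2. This is only an illustrative sketch: the dictionary layout and function names are assumptions, while the numbers are those reported in the tables.

```python
# Average scores per scenario, transcribed from Table 2 (percent).
RESULTS = {
    1: {"accuracy": 57, "precision": 63, "recall": 97, "f1": 75.05},
    2: {"accuracy": 94, "precision": 94, "recall": 100, "f1": 96.87},
    3: {"accuracy": 42, "precision": 52, "recall": 95, "f1": 65.51},
    4: {"accuracy": 95, "precision": 95, "recall": 100, "f1": 97.20},
    5: {"accuracy": 94, "precision": 95, "recall": 97, "f1": 95.91},
    6: {"accuracy": 39, "precision": 48, "recall": 96, "f1": 60.92},
    7: {"accuracy": 88, "precision": 91, "recall": 99, "f1": 94.36},
}

# Feature extractor type per scenario, transcribed from Table 1.
EXTRACTOR = {
    1: "faster_rcnn_inception_resnet_v2", 2: "faster_rcnn_inception_v2",
    3: "faster_rcnn_inception_resnet_v2", 4: "faster_rcnn_inception_v2",
    5: "faster_rcnn_inception_v2", 6: "faster_rcnn_inception_resnet_v2",
    7: "faster_rcnn_inception_v2",
}


def rank_scenarios(results, metric="f1"):
    """Return scenario numbers sorted from best to worst on the given metric."""
    return sorted(results, key=lambda s: results[s][metric], reverse=True)


def mean_by_extractor(results, extractor, metric="f1"):
    """Average the given metric over the scenarios sharing a feature extractor."""
    groups = {}
    for scenario, scores in results.items():
        groups.setdefault(extractor[scenario], []).append(scores[metric])
    return {name: sum(vals) / len(vals) for name, vals in groups.items()}


if __name__ == "__main__":
    print(rank_scenarios(RESULTS))
    print(mean_by_extractor(RESULTS, EXTRACTOR))
```

Grouping by extractor type separates the four faster_rcnn_inception_v2 scenarios from the three faster_rcnn_inception_resnet_v2 scenarios, which is the contrast discussed above.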

In addition to the machine learning model described above, our research also developed a simple
web-based tool as an interface that demonstrates the model’s ability to detect and recognize uploaded
hand-drawn iStar 2.0 notations. Figure 7 shows the sample image that will be detected and recognized.

The notation depicted in Figure 7 is the Task in the iStar 2.0 strategic dependency model.
Figure 8 displays the home page, where the Choose File feature can be clicked to upload the sample image.
When the Check Result button is clicked, the software pre-processes the image, detects the object,
and determines whether it is a Task, Resource, Quality, Goal, or Actor, along with its match rate. Figure 9
shows the result, where the software correctly identifies the notation depicted in the uploaded image
as a Task.

Figure 7. Sample Image for Object Detection


[Figure 8 screen text: "Check Result" button]





Figure 8. User interface of the web-based testing application 


[Figure 9 screen text: "The Result is 99% Task"]





Figure 9. User interface when the application displays the object detection result 


4. CONCLUSION

This research applied Faster R-CNN to a dataset comprising hand-drawn iStar 2.0 objects,
namely Generic Actor, Task, Resource, Quality, and Goal. Images are first pre-processed and replicated
using Gleam for color-to-grayscale conversion and Salt and Pepper noise to add noise to the original
dataset and multiply the number of images in the dataset. The resulting program performs best with a
learning rate of 0.0001 and feature extractor type faster_rcnn_inception_v2 with a first stage features
stride of 16, an initial crop size of 14, a maxpool kernel and stride of 2, and a height and width stride of 16,
resulting in an average of 94% accuracy, 95% precision, 100% recall, and 97.2% F1-score for each class.
The research shows the potential of Faster R-CNN, Gleam, and Salt and Pepper noise to build a model for
detecting and recognizing objects drawn in the iStar 2.0 notation, enabling the digitalization of requirements
diagrams to support the requirements modeling activity in software development. Future work includes
improving the dataset and machine learning model to digitalize a whole iStar 2.0 diagram, enabling
multi-object detection, and developing tools for editing and creating whole diagrams using iStar 2.0 and
other notations for requirements modeling. Optical character recognition techniques can also be integrated
to read the text inside the drawn objects.